<!DOCTYPE group PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<group>

%tableofcontents;

<p>This short tutorial shows how to
compute <b><a href="%dox:fisher;">Fisher vector</a></b> and
<b><a href="%dox:vlad;">VLAD</a></b> encodings with VLFeat MATLAB
interface.</p>

<p>These encoding serve a similar purposes: summarizing in a vectorial
statistic a number of local feature descriptors
(e.g. <a href="%dox:sift;">SIFT</a>). Similarly to bag of visual
words, they assign local descriptor to elements in a visual
dictionary, obtained with vector quantization (KMeans) in the case of
VLAD or a <a href="%dox:gmm;">Gaussian Mixture Models</a> for Fisher
Vectors. However, rather than storing visual word occurrences only,
these representations store a statistics of the difference between
dictionary elements and pooled local features.</p>

<!-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -->
<h1 id="tut.fisher">Fisher encoding</h1>
<!-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -->

<p>The Fisher encoding uses GMM to construct a visual word
dictionary. To exemplify constructing a GMM, consider a number of 2
dimensional data points (see also the <a href="%pathto:tut.gmm;">GMM
tutorial</a>). In practice, these points would be a collection of SIFT
or other local image features. The following code fits a GMM to the
points:</p>

<precode type='matlab'>
numFeatures = 5000 ;
dimension = 2 ;
data = rand(dimension,numFeatures) ;

numClusters = 30 ;
[means, covariances, priors] = vl_gmm(data, numClusters);
</precode>

<p>Next, we create another random set of vectors, which should be
encoded using the Fisher Vector representation and the GMM just
obtained:</p>

<precode type='matlab'>
numDataToBeEncoded = 1000;
dataToBeEncoded = rand(dimension,numDataToBeEncoded);
</precode>

<p>The Fisher vector encoding <code>enc</code> of these vectors is
obtained by calling the <code>vl_fisher</code> function using the
output of the <code>vl_gmm</code> function:</p>

<precode type='matlab'>
encoding = vl_fisher(datatoBeEncoded, means, covariances, priors);
</precode>

<p>The <code>encoding</code> vector is the Fisher vector
representation of the data <code>dataToBeEncoded</code>.</p>

<p>Note that Fisher Vectors support
several <a href="%dox:fisher-normalization;">normalization options</a>
that can affect substantially the performance of the
representation.</p>

<!-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -->
<h1 id="tut.vlad">VLAD encoding</h1>
<!-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -->

<p>The <b>V</b>ector
of <b>L</b>inearly <b>A</b>gregated <b>D</b>escriptors is similar to
Fisher vectors but (i) it does not store second-order information
about the features and (ii) it typically use KMeans instead of GMMs to
generate the feature vocabulary (although the latter is also an
option).</p>

<p>Consider the same 2D data matrix <code>data</code> used in the
previous section to train the Fisher vector representation. To compute
VLAD, we first need to obtain a visual word dictionary. This time, we
use K-means:</p>

<precode type='matlab'>
numClusters = 30 ;
centers = vl_kmeans(dataLearn, numClusters);
</precode>

<p>Now consider the data <code>dataToBeEncoded</code> and use
the <code>vl_vlad</code> function to compute the encoding. Differently
from <code>vl_fisher</code>, <code>vl_vlad</code> requires the
data-to-cluster assignments to be passed in. This allows using a fast
vector quantization technique (e.g. kd-tree) as well as switching from
soft to hard assignment.</p>

<p>In this example, we use a kd-tree for quantization:</p>

<precode type='matlab'>
kdtree = vl_kdtreebuild(centers) ;
nn = vl_kdtreequery(kdtree, centers, dataEncode) ;
</precode>

<p>Now we have in the <code>nn</code> the indexes of the nearest
center to each vector in the matrix <code>dataToBeEncoded</code>. The
next step is to create an assignment matrix:</p>

<precode>
assignments = zeros(numClusters,numDataToBeEncoded);
assignments(sub2ind(size(assignments), nn, 1:length(nn))) = 1;
</precode>

<p>It is now possible to encode the data using
the <code>vl_vlad</code> function:</p>

<precode>
enc = vl_vlad(dataToBeEncoded,centers,assignments);
</precode>

<p>Note that, similarly to Fisher vectors, VLAD supports
several <a href="%dox:vlad-normalization;">normalization options</a>
that can affect substantially the performance of the
representation.</p>

</group>




